Programming paradigms |
---|
|
Object-oriented programming (OOP) is a programming paradigm that uses "objects" – data structures consisting of data fields and methods together with their interactions – to design applications and computer programs. Programming techniques may include features such as data abstraction, encapsulation, modularity, polymorphism, and inheritance. Many modern programming languages now support OOP.
Contents |
An object is a discrete bundle of functions and procedures, often relating to a particular real-world concept such as a bank account holder or hockey player. Other pieces of software can access the object only by calling its functions and procedures that have been allowed to be called by outsiders. A large number of software engineers agree that isolating objects in this way makes their software easier to manage and keep track of. However, a significant number of engineers feel the reverse may be true: that software becomes more complex to maintain and document, or even to engineer from the start. The conditions under which OOP prevails over alternative techniques (and vice-versa) often remain unstated by either party, however, making rational discussion of the topic difficult, and often leading to "religious wars" over the matter.
Object-oriented programming has roots that can be traced to the 1960s. As hardware and software became increasingly complex, manageability often became a concern. Researchers studied ways to maintain software quality and developed object-oriented programming in part to address common problems by strongly emphasizing discrete, reusable units of programming logic. The technology focuses on data rather than processes, with programs composed of self-sufficient modules ("classes"), each instance of which ("objects") contains all the information needed to manipulate its own data structure ("members"). This is in contrast to the existing modular programming which had been dominant for many years that focused on the function of a module, rather than specifically the data, but equally provided for code reuse, and self-sufficient reusable units of programming logic, enabling collaboration through the use of linked modules (subroutines). This more conventional approach, which still persists, tends to consider data and behavior separately.
An object-oriented program may thus be viewed as a collection of interacting objects, as opposed to the conventional model, in which a program is seen as a list of tasks (subroutines) to perform. In OOP, each object is capable of receiving messages, processing data, and sending messages to other objects. Each object can be viewed as an independent 'machine' with a distinct role or responsibility. The actions (or "methods") on these objects are closely associated with the object. For example, the data structures tend to 'carry their own operators around with them' (or at least "inherit" them from a similar object or class). In the conventional model, the data and operations on the data don't have a tight, formal association.
The terms "objects" and "oriented" in something like the modern sense of object-oriented programming seem to make their first appearance at MIT in the late 1950s and early 1960s. In the milieu of the artificial intelligence group, as early as 1960, "object" could refer to identified items (LISP atoms) with properties (attributes);[1][2] Alan Kay was later to cite a detailed understanding of LISP internals as a strong influence on his thinking in 1966.[3] Another early MIT example was Sketchpad created by Ivan Sutherland in 1960-61; in the glossary of the 1963 technical report based on his dissertation about Sketchpad, Sutherland defined notions of "object" and "instance" (with the class concept covered by "master" or "definition"), albeit specialized to graphical interaction.[4] Also, an MIT Algol version, AED-0, linked data structures ("plexes", in that dialect) directly with procedures, prefiguring what were later termed "messages", "methods" and "member functions".[5][6]
Objects as a formal concept in programming were introduced in the 1960s in Simula 67, a major revision of Simula I, a programming language designed for discrete event simulation, created by Ole-Johan Dahl and Kristen Nygaard of the Norwegian Computing Center in Oslo.[7] Simula 67 was influenced by SIMSCRIPT and Hoare's proposed "record classes".[5][8] Simula introduced the notion of classes and instances or objects (as well as subclasses, virtual methods, coroutines, and discrete event simulation) as part of an explicit programming paradigm. The language also used automatic garbage collection which had been invented earlier for the functional programming language Lisp. Simula was used for physical modeling, such as models to study and improve the movement of ships and their content through cargo ports. The ideas of Simula 67 influenced many later languages, including Smalltalk, derivatives of LISP (CLOS), Object Pascal, and C++.
The Smalltalk language, which was developed at Xerox PARC (by Alan Kay and others) in the 1970s, introduced the term object-oriented programming to represent the pervasive use of objects and messages as the basis for computation. Smalltalk creators were influenced by the ideas introduced in Simula 67, but Smalltalk was designed to be a fully dynamic system in which classes could be created and modified dynamically rather than statically as in Simula 67.[9] Smalltalk and with it OOP were introduced to a wider audience by the August 1981 issue of Byte magazine.
In the 1970s, Kay's Smalltalk work had influenced the Lisp community to incorporate object-based techniques which were introduced to developers via the Lisp machine. Experimentation with various extensions to Lisp (like LOOPS and Flavors introducing multiple inheritance and mixins), eventually led to the Common Lisp Object System (CLOS, a part of the first standardized object-oriented programming language, ANSI Common Lisp), which integrates functional programming and object-oriented programming and allows extension via a Meta-object protocol. In the 1980s, there were a few attempts to design processor architectures which included hardware support for objects in memory but these were not successful. Examples include the Intel iAPX 432 and the Linn Smart Rekursiv.
Object-oriented programming developed as the dominant programming methodology in the early and mid 1990's when programming languages supporting the techniques became widely available. These included Visual FoxPro 3.0,[10][11][12] C++, and Delphi. Its dominance was further enhanced by the rising popularity of graphical user interfaces, which rely heavily upon object-oriented programming techniques. An example of a closely related dynamic GUI library and OOP language can be found in the Cocoa frameworks on Mac OS X, written in Objective-C, an object-oriented, dynamic messaging extension to C based on Smalltalk. OOP toolkits also enhanced the popularity of event-driven programming (although this concept is not limited to OOP). Some feel that association with GUIs (real or perceived) was what propelled OOP into the programming mainstream.
At ETH Zürich, Niklaus Wirth and his colleagues had also been investigating such topics as data abstraction and modular programming (although this had been in common use in the 1960s or earlier). Modula-2 (1978) included both, and their succeeding design, Oberon, included a distinctive approach to object orientation, classes, and such. The approach is unlike Smalltalk, and very unlike C++.
Object-oriented features have been added to many existing languages during that time, including Ada, BASIC, Fortran, Pascal, and others. Adding these features to languages that were not initially designed for them often led to problems with compatibility and maintainability of code.
More recently, a number of languages have emerged that are primarily object-oriented yet compatible with procedural methodology, such as Python and Ruby. Probably the most commercially important recent object-oriented languages are Visual Basic.NET (VB.NET) and C#, both designed for Microsoft's .NET platform, and Java, developed by Sun Microsystems. Both frameworks show the benefit of using OOP by creating an abstraction from implementation in their own way. VB.NET and C# support cross-language inheritance, allowing classes defined in one language to subclass classes defined in the other language. Java runs in a virtual machine, making it possible to run on all different operating systems. VB.NET and C# make use of the Strategy pattern to accomplish cross-language inheritance, whereas Java makes use of the Adapter pattern.
Just as procedural programming led to refinements of techniques such as structured programming, modern object-oriented software design methods include refinements such as the use of design patterns, design by contract, and modeling languages (such as UML).
A survey by Deborah J. Armstrong of nearly 40 years of computing literature identified a number of "quarks", or fundamental concepts, found in the strong majority of definitions of OOP.[13]
Not all of these concepts are to be found in all object-oriented programming languages, and so object-oriented programming that uses classes is called sometimes class-based programming. In particular, prototype-based programming does not typically use classes. As a result, a significantly different yet analogous terminology is used to define the concepts of object and instance.
Benjamin Cuire Pierce and some other researchers view as futile any attempt to distill OOP to a minimal set of features. He nonetheless identifies fundamental features that support the OOP programming style in most object-oriented languages:[14]
this
or self
, that allows a method body to invoke another method body of the same object. This variable is late-bound; it allows a method defined in one class to invoke another method that is defined later, in some subclass thereof.Similarly, in his 2003 book, Concepts in programming languages, John C. Mitchell identifies four main features: dynamic dispatch, abstraction, subtype polymorphism, and inheritance.[15] Michael Lee Scott in Programming Language Pragmatics considers only encapsulation, inheritance and dynamic dispatch.[16]
A Class is a user defined datatype which contains the variables, properties and methods in it. A class defines the abstract characteristics of a thing (object), including its characteristics (its attributes, fields or properties) and the thing's behaviors (the things it can do, or methods, operations or features). One might say that a class is a blueprint or factory that describes the nature of something. For example, the class Dog
would consist of traits shared by all dogs, such as breed and fur color (characteristics), and the ability to bark and sit (behaviors). Classes provide modularity and structure in an object-oriented computer program. A class should typically be recognizable to a non-programmer familiar with the problem domain, meaning that the characteristics of the class should make sense in context. Also, the code for a class should be relatively self-contained (generally using encapsulation). Collectively, the properties and methods defined by a class are called members.
One can have an instance of a class; the instance is the actual object created at run-time. In programmer vernacular,the Lassie
object is an instance of the Dog
class. The set of values of the attributes of a particular object is called its state. The object consists of state and the behavior that's defined in the object's classes
Method is a set of procedural statements for achieving the desired result. It performs different kinds of operations on different data types. In a programming language, methods (sometimes referred to as "functions") are verbs. Lassie
, being a Dog
, has the ability to bark. So bark()
is one of Lassie
's methods. She may have other methods as well, for example sit()
or eat()
or walk()
or save(Timmy)
. Within the program, using a method usually affects only one particular object; all Dog
s can bark, but you need only one particular dog to do the barking.
"The process by which an object sends data to another object or asks the other object to invoke a method."[13] Also known to some programming languages as interfacing. For example, the object called Breeder
may tell the Lassie
object to sit by passing a "sit" message which invokes Lassie's "sit" method. The syntax varies between languages, for example: [Lassie sit]
in Objective-C. In Java, code-level message passing corresponds to "method calling". Some dynamic languages use double-dispatch or multi-dispatch to find and pass messages.
Inheritance is a process in which a class inherits all the state and behavior of another class. this type of relationship is called child-Parent or is-a relationship. "Subclasses" are more specialized versions of a class, which inherit attributes and behaviors from their parent classes, and can introduce their own.
For example, the class Dog
might have sub-classes called Collie
, Chihuahua
, and GoldenRetriever
. In this case, Lassie
would be an instance of the Collie
subclass. Suppose the Dog
class defines a method called bark()
and a property called furColor
. Each of its sub-classes (Collie
, Chihuahua
, and GoldenRetriever
) will inherit these members, meaning that the programmer only needs to write the code for them once.
Each subclass can alter its inherited traits. For example, the Collie
subclass might specify that the default furColor
for a collie is brown-and-white. The Chihuahua
subclass might specify that the bark()
method produces a high pitch by default. Subclasses can also add new members. The Chihuahua
subclass could add a method called tremble()
. So an individual chihuahua instance would use a high-pitched bark()
from the Chihuahua
subclass, which in turn inherited the usual bark()
from Dog
. The chihuahua object would also have the tremble()
method, but Lassie
would not, because she is a Collie
, not a Chihuahua
. In fact, inheritance is an "a... is a" relationship between classes, while instantiation is an "is a" relationship between an object and a class: a Collie
is a Dog
("a... is a"), but Lassie
is a Collie
("is a"). Thus, the object named Lassie
has the methods from both classes Collie
and Dog
.
Multiple inheritance is inheritance from more than one ancestor class, neither of these ancestors being an ancestor of the other. For example, independent classes could define Dog
s and Cat
s, and a Chimera
object could be created from these two which inherits all the (multiple) behavior of cats and dogs. This is not always supported, as it can be hard to implement.
Abstraction is simplifying complex reality by modeling classes appropriate to the problem, and working at the most appropriate level of inheritance for a given aspect of the problem.
For example, Lassie
the Dog
may be treated as a Dog
much of the time, a Collie
when necessary to access Collie
-specific attributes or behaviors, and as an Animal
(perhaps the parent class of Dog
) when counting Timmy's pets.
Encapsulation conceals the functional details of a class from objects that send messages to it.
For example, the Dog
class has a bark()
method variable,data. The code for the bark()
method defines exactly how a bark happens (e.g., by inhale()
and then exhale()
, at a particular pitch and volume). Timmy, Lassie
's friend, however, does not need to know exactly how she barks. Encapsulation is achieved by specifying which classes may use the members of an object. The result is that each object exposes to any class a certain interface — those members accessible to that class. The reason for encapsulation is to prevent clients of an interface from depending on those parts of the implementation that are likely to change in the future, thereby allowing those changes to be made more easily, that is, without changes to clients. For example, an interface can ensure that puppies can only be added to an object of the class Dog
by code in that class. Members are often specified as public, protected or private, determining whether they are available to all classes, sub-classes or only the defining class. Some languages go further: Java uses the default access modifier to restrict access also to classes in the same package, C# and VB.NET reserve some members to classes in the same assembly using keywords internal (C#) or Friend (VB.NET). Eiffel and C++ allow one to specify which classes may access any member.
Polymorphism allows the programmer to treat derived class members just like their parent class's members. More precisely, Polymorphism in object-oriented programming is the ability of objects belonging to different data types to respond to calls of methods of the same name, each one according to an appropriate type-specific behavior. One method, or an operator such as +, -, or *, can be abstractly applied in many different situations. If a Dog
is commanded to speak()
, this may elicit a bark()
. However, if a Pig
is commanded to speak()
, this may elicit an oink()
. Each subclass overrides the speak()
method inherited from the parent class Animal
.
Decoupling allows for the separation of object interactions from classes and inheritance into distinct layers of abstraction. A common use of decoupling is to polymorphically decouple the encapsulation, which is the practice of using reusable code to prevent discrete code modules from interacting with each other. However, in practice decoupling often involves trade-offs with regard to which patterns of change to favor. The science of measuring these trade-offs in respect to actual change in an objective way is still in its infancy.
There have been several attempts at formalizing the concepts used in object-oriented programming. The following concepts and constructs have been used as interpretations of OOP concepts:
Attempts to find a consensus definition or theory behind objects have not proven very successful (however, see Abadi & Cardelli, A Theory of Objects[17] for formal definitions of many OOP concepts and constructs), and often diverge widely. For example, some definitions focus on mental activities, and some on mere program structuring. One of the simpler definitions is that OOP is the act of using "map" data structures or arrays that can contain functions and pointers to other maps, all with some syntactic and scoping sugar on top. Inheritance can be performed by cloning the maps (sometimes called "prototyping"). OBJECT:=>> Objects are the run time entities in an object-oriented system. They may represent a person, a place, a bank account, a table of data or any item that the program has to handle.
Simula (1967) is generally accepted as the first language to have the primary features of an object-oriented language. It was created for making simulation programs, in which what came to be called objects were the most important information representation. Smalltalk (1972 to 1980) is arguably the canonical example, and the one with which much of the theory of object-oriented programming was developed.
In recent years, object-oriented programming has become especially popular in scripting programming languages. Python, Ruby and Groovy are scripting languages built on OOP principles, while Perl and PHP have been adding object oriented features since Perl 5 and PHP 4, and ColdFusion since version 6.
The Document Object Model of HTML, XHTML, and XML documents on the Internet have bindings to the popular JavaScript/ECMAScript language. JavaScript is perhaps the best known prototype-based programming language, which employs cloning from prototypes rather than inheriting from a class. Another scripting language which takes this approach is Lua. Earlier versions of ActionScript (a partial superset of the ECMA-262 R3, otherwise known as ECMAScript) also used a prototype based object model. Later versions of ActionScript incorporate a combination of classification and prototype based object models based largely on the currently incomplete ECMA-262 R4 specification, which has its roots in an early JavaScript 2 Proposal. Microsoft's JScript.NET also includes a mash-up of object models based on the same proposal, and is also a superset of the ECMA-262 R3 specification.
Challenges of object-oriented design are addressed by several methodologies. Most common is known as the design patterns codified by Gamma et al.. More broadly, the term "design patterns" can be used to refer to any general, repeatable solution to a commonly occurring problem in software design. Some of these commonly occurring problems have implications and solutions particular to object-oriented development.
It is intuitive to assume that inheritance creates a semantic "is a" relationship, and thus to infer that objects instantiated from subclasses can always be safely used instead of those instantiated from the superclass. This intuition is unfortunately false in most OOP languages, in particular in all those that allow mutable objects. Subtype polymorphism as enforced by the type checker in OOP languages (with mutable objects) cannot guarantee behavioral subtyping in any context. Behavioral subtyping is undecidable in general, so it cannot be implemented by a program (compiler). Class or object hierarchies need to be carefully designed considering possible incorrect uses that cannot be detected syntactically. This issue is known as the Liskov substitution principle.
Design Patterns: Elements of Reusable Object-Oriented Software is an influential book published in 1995 by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, sometimes casually called the "Gang of Four". Along with exploring the capabilities and pitfalls of object-oriented programming, it describes 23 common programming problems and patterns for solving them. As of April 2007, the book was in its 36th printing.
The book describes the following patterns:
Both object-oriented programming and relational database management systems (RDBMSs) are extremely common in software today[update]. Since relational databases don't store objects directly (though some RDBMSs have object-oriented features to approximate this), there is a general need to bridge the two worlds. The problem of bridging object-oriented programming accesses and data patterns with relational databases is known as Object-Relational impedance mismatch. There are a number of approaches to cope with this problem, but no general solution without downsides[18]. One of the most common approaches is object-relational mapping, as found in libraries like Java Data Objects and Ruby on Rails' ActiveRecord.
There are also object databases which can be used to replace RDBMSs, but these have not been as technically and commercially successful as RDBMSs.
OOP can be used to translate from real-world phenomena to program elements (and vice versa). OOP was even invented for the purpose of physical modeling in the Simula-67 programming language. However, not everyone agrees that direct real-world mapping is facilitated by OOP (see Criticism section), or is even a worthy goal; Bertrand Meyer argues in Object-Oriented Software Construction[19] that a program is not a model of the world but a model of some part of the world; "Reality is a cousin twice removed". At the same time, some principal limitations of OOP had been noted.[20] An example for a real world problem which cannot be modeled elegantly with OOP techniques is the Circle-ellipse problem.
However, Niklaus Wirth (who popularized the adage now known as Wirth's law: "Software is getting slower more rapidly than hardware becomes faster") said of OOP in his paper, "Good Ideas through the Looking Glass", "This paradigm closely reflects the structure of systems 'in the real world', and it is therefore well suited to model complex systems with complex behaviours" (contrast KISS principle).
However, it was also noted (e.g. in Steve Yegge's essay Execution in the Kingdom of Nouns[1]) that the OOP approach of strictly prioritizing things (objects/nouns) before actions (methods/verbs) is an paradigma not found in natural languages.[21] This limitation may lead for some real world modelling to overcomplicated results compared e.g. to procedural approaches.[22]
OOP was developed to increase the reusability and maintainability of source code.[23] Transparent representation of the control flow had no priority and was meant to be handled by a compiler. With the increasing relevance of parallel hardware and multithreaded coding, developer transparent control flow becomes more important, something hard to achieve with OOP.[24][25][26]
Responsibility-driven design defines classes in terms of a contract, that is, a class should be defined around a responsibility and the information that it shares. This is contrasted by Wirfs-Brock and Wilkerson with data-driven design, where classes are defined around the data-structures that must be held. The authors hold that responsibility-driven design is preferable.
According to an article in the Information and Software Technology journal by Alexander Chatzigeorgiou of the Department of Applied Informatics, at the University of Macedonia, the object-oriented approach is known to introduce a significant performance penalty compared to classical procedural programming. For instance, profiling results for embedded applications indicate that C++ programs, apart from being slower than their corresponding C versions, consume significantly more energy (mainly due to the increased instruction count, larger code size and increased number of accesses to the data memory for the object-oriented versions).[27]
Why this is so is partly due to the indirect way that objects are modified or examined (by message passing and methods - rather than by direct assignment); partly through the OO requirement to frequently physically copy data (as parameters lists into messages, rather than merely pointing to existing data) - resulting in extra memory requirements and its corresponding allocation and deallocation, and partly through the use of dynamic dispatch (as opposed to static calls).
A number of well-known researchers and programmers have criticized OOP. Here is an incomplete list:
|
|